1 |
Perturbation and Pitch Normalization as Enhancements to Speaker Recognition
|
|
|
|
In: DTIC (2009)
|
|
Abstract:
This study proposes an approach to improving speaker recognition through minute vocal tract length perturbation of training files, coupled with pitch normalization of both training and test data. The notion of perturbation as a method for improving the robustness of training data for supervised classification is taken from the field of optical character recognition, where distorting characters within a certain range has shown strong improvements across disparate conditions. This paper demonstrates that acoustic perturbation, in this case the analysis, distortion, and resynthesis of vocal tract length for a given speaker, significantly improves speaker recognition when the resulting files are used to augment or replace the training data. A pitch length normalization technique is also discussed, which is combined with perturbation to improve open-set speaker recognition from an equal error rate (EER) of 20% to 6.7%. See also ADM002314. IEEE International Conference on Acoustics, Speech and Signal Processing (34th), held in Taipei, Taiwan, on 19-24 April 2009. U.S. Government or Federal Purpose Rights License. The original document contains color images.
|
|
Keyword:
*PERTURBATIONS; *SPEECH ANALYSIS; Acoustics; CLASSIFICATION; COMPONENT REPORTS; PITCH NORMALIZATION; RECOGNITION; SOUND PITCH; SPEAKER RECOGNITION; SPEECH; SPEECH SYNTHESIS; SYMPOSIA; Voice Communications
|
|
URL: http://oai.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA555288 http://www.dtic.mil/docs/citations/ADA555288
|
|
BASE
|
|
2 |
Automating Convoy Training Assessment to Improve Soldier Performance
|
|
|
|
In: DTIC (2008)
|
|
BASE
|
|
3 |
Cross-Cultural Nonverbal Cue Immersive Training
|
|
|
|
In: DTIC (2008)
|
|
BASE
|
|
5 |
The 2005 AFRL/HEC One-Speaker Detection Systems
|
|
|
|
In: DTIC (2006)
|
|
BASE
|
|
6 |
Enhancing Mental Readiness in Military Personnel
|
|
|
|
In: DTIC (2006)
|
|
BASE
|
|
7 |
Towards a Formal Ontology for Military Coalitions Operations
|
|
|
|
In: DTIC (2005)
|
|
BASE
|
|
8 |
Comment ameliorer la selection et le traitement des messages verbaux? (How to Improve the Selection and Processing of Verbal Messages)
|
|
|
|
In: DTIC (2005)
|
|
BASE
|
|
9 |
Evaluation of Speech Synthesis Systems using the Speech Reception Threshold Methodology
|
|
|
|
In: DTIC (2005)
|
|
BASE
|
|
10 |
Initial Kernel Timing Using a Simple PIM Performance Model
|
|
|
|
In: DTIC AND NTIS (2005)
|
|
BASE
|
|
11 |
Speech Intelligibility with a Bone Vibrator
|
|
|
|
In: DTIC (2005)
|
|
BASE
|
|
12 |
Speech Intelligibility with Acoustic and Contact Microphones
|
|
|
|
In: DTIC (2005)
|
|
BASE
|
|
13 |
Objective Measurement of the Speech Transmission Quality of Vocoders by Means of the Speech Transmission Index
|
|
|
|
In: DTIC (2005)
|
|
BASE
|
|
14 |
Analysis of Free-Form Battlefield Reports with Shallow Parsing Techniques
|
|
|
|
In: DTIC AND NTIS (2004)
|
|
BASE
|
|
15 |
Representational and Inferential Requirements for Diagrammatic Reasoning in the Entity Re-Identification Task
|
|
|
|
In: DTIC AND NTIS (2004)
|
|
BASE
|
|
16 |
The Case for Using Semantic Nets as a Convergence Format for Symbolic Information Fusion
|
|
|
|
In: DTIC AND NTIS (2004)
|
|
BASE
|
|
17 |
Electronic Information Management and Intellectual Property Rights
|
|
|
|
In: DTIC AND NTIS (2004)
|
|
BASE
|
|
18 |
Ontological Approach to Military Knowledge Modeling and Management
|
|
|
|
In: DTIC AND NTIS (2004)
|
|
BASE
|
|
19 |
Story Link Detection and New Event Detection are Asymmetric
|
|
|
|
In: DTIC (2003)
|
|
BASE
|
|